CELLOBIOHYDROLASE I GENE AND IMPROVED VARIANTS

The United States Government has rights in this invention under contract number
DE-AC36-99GO10337 between the United States Department of Energy and the National
Renewable Energy Laboratory, a division of the Midwest Research Institute.

The present application claims priority to PCT Application PCT/US00/19007 filed July
13, 2000, which is hereby incorporated by reference. PCT/US00/19007 claims priority to U.S.
Provisional Application 60/143,711 filed July 13, 1999.

Field of the Invention. 

This invention relates to 1,4-β-cellobiohydrolases or exoglucanases. More specifically, it
relates to the Trichoderma reesei cellobiohydrolase I gene, the creation of reduced glycosylation
variants of the expressed CBH I protein to enable the expression of active enzyme in
heterologous hosts, and to the creation of new thermally stable variants of the enzyme that confer
higher thermal tolerance on the protein and improved performance.

Background of the Invention. 

The surface chemistry of acid-pretreated biomass used in ethanol production differs
from that of plant tissues naturally digested by fungal cellulase enzymes in two important
ways: (1) pretreatment heats the substrate past the phase-transition temperature of lignin; and (2)
pretreated biomass contains less acetylated hemicellulose. Thus, it is believed that the cellulose
fibers of pretreated biomass are coated with displaced and modified lignin. This alteration results
in non-specific binding of the protein to the biomass, which impedes enzymatic activity.
Moreover, where the pretreated biomass is a hardwood pulp, it has a weak net-negatively
charged surface, which is not observed in native wood. Therefore, for the efficient production of
ethanol from a pretreated biomass such as corn stover, wood or other biomass, it is desirable to
enhance the catalytic activity of glycosyl hydrolases, specifically the cellobiohydrolases.

Trichoderma reesei CBH I (SEQ ID NO: 5) is a mesophilic cellulase which plays a major
role in the hydrolysis of cellulose. An artificial ternary cellulase system consisting of a 90:10:2
mixture of T. reesei CBH I, Acidothermus cellulolyticus EI, and Aspergillus niger
β-D-glucosidase is capable of releasing as much reducing sugar from pretreated yellow poplar as the
native T. reesei system after 120 h. This result is encouraging for the ultimate success of
engineered cellulase systems, because this artificial enzyme system was tested at 50°C, a
temperature far below that considered optimal for EI, in order to spare the more heat labile
enzymes CBH I and β-D-glucosidase. To increase the efficiency of such artificial enzyme
systems it is desirable to engineer new T. reesei CBH I variant enzymes capable of active
expression in heterologous hosts. The use of the heterologous host Aspergillus awamori could
provide an excellent capacity for synthesis and secretion of T. reesei CBH I because of its ability
to correctly fold and post-translationally modify proteins of eukaryotic origin. Moreover, A.
awamori is believed to be an excellent test-bed for Trichoderma coding sequences and resolves
some of the problems associated with site directed mutagenesis and genetic engineering in
Trichoderma.

In consideration of the foregoing, it is therefore desirable to provide variant cellulase
enzymes having enzymatic activity when expressed in a heterologous host, and to provide
variant cellulase enzymes that have improved thermal tolerance over the native enzyme as
produced by Trichoderma reesei.

Summary of the Invention. 

It is a general object of the present invention to provide variant cellulase enzymes having
enzymatic activity when expressed in a heterologous host, such as a filamentous fungus or yeast.

Another object of the invention is to provide variant exoglucanases characterized by a
reduction in glycosylation when expressed in a heterologous host.

Another object of the invention is to provide an active cellobiohydrolase enzyme capable 
of expression in heterologous fungi including yeast. 

Another object of the invention is to provide improved thermal tolerant variants of the 
cellobiohydrolase enzyme capable of functioning at increased process temperatures. 

It is yet another object of the invention to provide a method for reducing the 
glycosylation of a cellobiohydrolase enzyme for expression in a heterologous host. 

The foregoing specific objects and advantages of the invention are illustrative of those
which can be achieved by the present invention and are not intended to be exhaustive or limiting
of the possible advantages which can be realized. Thus, those and other objects and advantages
of the invention will be apparent from the description herein or can be learned from practicing
the invention, both as embodied herein or as modified in view of any variations which may be
apparent to those skilled in the art.

Briefly, the invention provides a method for making an active cellobiohydrolase in a
heterologous host, the method comprising reducing glycosylation of the cellobiohydrolase,
reducing glycosylation further comprising replacing an N-glycosylation site amino acid residue
with a non-glycosyl accepting amino acid residue. The invention further provides a
cellobiohydrolase, comprising the reduced glycosylation variant cellobiose enzymes
CBHI-N45A; CBHI-N270A; or CBHI-N384A, or any combination thereof.

Detailed Description of the Invention. 

Unless specifically defined otherwise, all technical or scientific terms used herein have 
the same meaning as commonly understood by one of ordinary skill in the art to which this 
invention belongs. Although any methods and materials similar or equivalent to those described 
herein can be used in the practice or testing of the present invention, the preferred methods and 
materials are now described. 

The terms "native" and "wild-type" are used interchangeably throughout this disclosure to 
indicate the origin of the molecule as it occurs in nature. 

A method for reducing the glycosylation of an expressed Trichoderma reesei CBHI
protein by site-directed mutagenesis ("SDM") is disclosed. The method includes replacing an
N-glycosylation site amino acid residue, such as asparagines 45, 270, and/or 384 (referenced herein
as CBHI-N45A, CBHI-N270A and CBHI-N384A, respectively), with a non-glycosyl accepting
amino acid residue, such as alanine. Various mutagenesis kits for SDM are available to those
skilled in the art and the methods for SDM are well known. The description below discloses a
procedure for making and using CBHI variants: CBHI-N45A (SEQ ID NO: 6); CBHI-N270A
(SEQ ID NO: 7); and CBHI-N384A (SEQ ID NO: 8). The examples below demonstrate the
expression of active CBHI in the heterologous fungus Aspergillus awamori.
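
The asparagine-to-alanine replacement described above can be illustrated at the protein-sequence level with a minimal Python sketch. The function names and the 12-residue peptide are hypothetical and chosen only for illustration; the peptide is not the CBH I sequence, and the 1-based position numbering is an assumption.

```python
# Minimal sketch: replace asparagine (N) with alanine (A) at chosen positions.
# Positions are 1-based; the peptide below is a made-up toy, NOT CBH I.

def substitute_asn_with_ala(protein, positions):
    """Return a copy of `protein` with Asn -> Ala at each 1-based position."""
    residues = list(protein)
    for pos in positions:
        current = residues[pos - 1]
        if current != "N":
            # Refuse to mutate a residue that is not an asparagine.
            raise ValueError(f"position {pos} is {current}, not Asn (N)")
        residues[pos - 1] = "A"
    return "".join(residues)


def single_variant_name(position):
    """Name a single-site variant in the style used in the text, e.g. CBHI-N45A."""
    return f"CBHI-N{position}A"


toy_peptide = "MKNGSTANCTQN"  # hypothetical; N at positions 3, 8 and 12
mutated = substitute_asn_with_ala(toy_peptide, [3, 8])
print(mutated)
print(single_variant_name(45))
```

The check on the existing residue guards against an off-by-one error in position numbering, which is a common source of mistakes when positions are quoted relative to the mature protein rather than the precursor.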

Variants of CBH I embodiments include mutations that provide for improved end product 
inhibition and for thermal tolerance. 

Brief Description of the Figures.

Figure 1. Coding sequence of the cbh 1 gene (SEQ ID NO: 4). Lower case letters
represent the signal sequence, upper case letters the catalytic domain, bolded italics the linker
region, and upper case underlined the cellulose-binding domain.

Figure 2. SDS-PAGE Western blot with anti-CBH I antibody showing the reduction in
molecular weight of rCBH I expression clones as a function of introduction of N to A
modifications.

Figure 3. Plasmid map for the fungal expression vector pPFE2/CBH I.

Figure 4. Nucleotide sequence SEQ ID NO: 1,
5'-CCTCCCGGCGGAAACCCGCCTGGCACCACCACCACCCGCCGCCCA-3', coding for the
linker region, PPGGNPPGTTTTRRP (SEQ ID NO: 2), of the CBH I protein, showing additional
proline residues that affect conformation of the linker region in the protein structure.
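
The correspondence between the nucleotide sequence and the peptide given for Figure 4 can be checked by translating the 45 nucleotides with the standard genetic code. The following Python sketch does so; the helper name `translate` and the variable names are illustrative only.

```python
# Sketch: translate the Figure 4 linker sequence with the standard genetic code.
BASES = "TCAG"
# Standard code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string (5'->3') into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


linker_dna = "CCTCCCGGCGGAAACCCGCCTGGCACCACCACCACCCGCCGCCCA"  # as given for Figure 4
linker_peptide = translate(linker_dna)
print(linker_peptide)             # compare with the peptide stated in Figure 4
print(linker_peptide.count("P"))  # number of proline residues in the linker
```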

Example 1. Acquisition of the CBH I Encoding Sequence.

Acquisition of the gene was done by either cDNA cloning or by PCR of the gene from
genomic DNA. CBH I cDNA was isolated from a T. reesei strain RUT C-30 cDNA library
constructed using a PCR-generated probe based on published CBH I gene sequences
(Shoemaker, et al., 1983). The cDNAs were cloned (using the Zap Express cDNA kit from
Stratagene; cat. #200403) into the XhoI and EcoRI site(s) of the supplied, pre-cut lambda arms.
An XhoI site was added to the 3' end of the cDNA during cDNA synthesis, and sticky-ended RE
linkers were added to both ends. After XhoI digestion, one end has an XhoI overhang, and the
other (5' end) has an EcoRI overhang. The insert can be removed from this clone as an
approximately 1.7 kb fragment using SalI or SpeI plus XhoI in a double digest. There are two
EcoRI, one BamHI, three SacI and one HindIII sites in the coding sequence of the cDNA itself.
The plasmid corresponding to this clone was excised in vivo from the original lambda clone, and
corresponds to pB210-5A. Thus, the cDNA is inserted in parallel with a Lac promoter in the
pBK-CMV parent vector. Strain pB210-5A grows on LB + kanamycin (50 µg/mL).

Acquisition of the cbh 1 gene was also achieved by PCR of genomic DNA. With this
approach the fungal chromosomal DNA from T. reesei strain Rut C-30 was prepared by grinding
the fungal hyphae in liquid nitrogen to a fine powder using a mortar and pestle. The genomic
DNA was then extracted from the cell debris using a Qiagen DNeasy Plant Mini kit.
Amplification of the DNA fragment that encodes the cbh 1 gene, including introns, was
performed using polymerase chain reaction (PCR) with specific primers for the T. reesei cbh 1
gene. The primer 5'-AGAGAGTCTAGACACGGAGCTTACAGGC-3' (SEQ ID NO: 9), which
introduces an Xba I site, and the primer
5'-AAAGAAGCGCGGCCGCGCCTGCACTCTCCAATCGG-3' (SEQ ID NO: 97), which introduces
a unique Not I site, were used to allow cloning into the pPFE Aspergillus/E. coli shuttle
vectors that are described below. The amplified PCR product was then gel purified and cloned
directly into the vectors.
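
As a quick check on the two primers above, the following Python sketch searches each primer for the recognition sequences of Xba I (TCTAGA) and Not I (GCGGCCGC). The variable and function names are illustrative; only the primer sequences come from the text.

```python
# Sketch: locate restriction enzyme recognition sites within the two PCR primers.
RECOGNITION_SITES = {
    "XbaI": "TCTAGA",
    "NotI": "GCGGCCGC",
}

primer_xba = "AGAGAGTCTAGACACGGAGCTTACAGGC"
primer_not = "AAAGAAGCGCGGCCGCGCCTGCACTCTCCAATCGG"


def find_sites(primer):
    """Map enzyme name -> 0-based index of its first recognition site in `primer`."""
    return {
        enzyme: primer.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in primer
    }


print(find_sites(primer_xba))
print(find_sites(primer_not))
```

Both recognition sequences are palindromic, so searching one strand is sufficient. The bases preceding each site give the enzyme room to cut near the end of the PCR product.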

Example 2. Production of Active Recombinant CBH I (rCBH I) in Aspergillus awamori.
Construction of the Fungal Expression Vectors pPFE-1/CBH I and pPFE-2/CBH I.

The coding sequence for T. reesei CBH I was successfully inserted and expressed in
Aspergillus awamori using the fungal expression vector pPFE2 (and pPFE1). Vectors pPFE1 and
pPFE2 are E. coli - Aspergillus shuttle vectors, and contain elements required for maintenance in
both hosts. Both pPFE1 and pPFE2 vectors direct the expression of a fusion protein with a
portion of the glucoamylase gene fused to the gene of interest. The pPFE1 vector contains a
region of the glucoamylase gene, with expression under the control of the A. awamori
glucoamylase promoter. The protein of interest is expressed as a fusion protein with the secretion
signal peptide and 498 amino acids of the catalytic domain of the glucoamylase protein. The
majority of the work presented here was done using the pPFE2 expression vector, which was
chosen because of its smaller size, simplifying the PCR mutation strategy by reducing extension
time.

The major features of the pPFE2/CBH I construct are shown in Figure 3. With both the
pPFE1/CBH I and the pPFE2/CBH I vectors, the sequence immediately upstream of the Not I site
encodes a LysArg dipeptide. A host KEX-2 like protease recognizes this dipeptide sequence
during the secretion process, and the fusion peptide is cleaved, removing the glucoamylase
secretion signal peptide or the longer catalytic domain of glucoamylase in the case of pPFE1. In
this way, the recombinant CBH I protein experiences an "efficient ride" through the A. awamori
secretion system and is expressed with the native N-terminus. The net result is that the
recombinant CBH I is processed so that it can accumulate in the medium without its
glucoamylase secretion signal fusion partner. The vector contains the Streptoalloteichus
hindustanus phleomycin resistance gene, under the control of the A. niger β-tubulin promoter,
for positive selection of Aspergillus transformants. The pPFE/CBH I vector also contains a
β-lactamase gene for positive selection using ampicillin in E. coli, and also contains the A. niger
trpC terminator. The insertion of the CBH I coding sequence into the pPFE vectors was
accomplished using two methods. Vector DNA was first produced in 500 mL cultures of E. coli
XL1 Blue and the plasmids purified using Promega maxi-prep DNA purification kits.

Approach 1: Blunt-Xba I fragment generation.

1. Oligonucleotides were designed to give a blunt end on the 5' end and an engineered Xba I
site on the 3' end of the PCR fragment.

2. The full-length coding sequence for CBH I was obtained by PCR using Pfu DNA
polymerase and using the cDNA construct pB510-2a as the template. Pfu DNA
polymerase generates blunt-ended PCR products exclusively.

3. The pPFE vectors were digested using NotI and confirmed by agarose gel
electrophoresis. The NotI overhang was then digested using Mung Bean nuclease. The
DNA was purified and the vector and CBH I PCR fragment digested using XbaI.

4. The vector and PCR product were then ligated using T4 DNA ligase and the DNA used
to transform E. coli XL-1 Blue and E. coli DH5α using electroporation.

Approach 2: NotI-XbaI fragment approach.

1. Oligonucleotides were designed to give a Not I site on the 5' end, and an engineered Xba
I site on the 3' end of the PCR fragment.

2. The full-length coding sequence for CBH I was obtained by PCR using Pfu DNA
polymerase and using the cDNA construct pB510-2a as the template.

3. The pPFE vectors and the PCR product were digested using Not I and Xba I.

4. The CBH I PCR product was directionally cloned into the pPFE2 vector using T4 DNA 
ligase and transformed into E. coli XL-1 Blue. 

5. The insertion of the CBH I coding sequence into the pPFE2 vector was confirmed using 
PCR, restriction digest analysis, and DNA sequencing through the insertion sites. The 
entire coding sequence of the insert was also confirmed by DNA sequencing. 

The constructs produced using these two methods were then used to transform A. awamori
and to express rCBH I, as confirmed by western blot analysis of culture supernatant. The rCBH I
expressed in A. awamori tends to be over glycosylated as evidenced by the higher molecular
weight observed on western blot analysis. Over glycosylation of CBH I by A. awamori was
confirmed by digestion of the recombinant protein with endoglycosidases. Following
endoglycosidase H and F digestion, the higher molecular weight form of the protein collapses to
a molecular weight similar to native CBH I.

Example 3. Method for producing PCR site directed mutations for glycosylation removal
and improved thermal stability.

The QuikChange™ Site Directed Mutagenesis kit (Stratagene, San Diego, CA) was
used to generate mutants with targeted amino acid substitutions. To introduce these specific
amino acid substitutions, mutagenic primers (between 25 and 45 bases in length) were designed
to contain the desired mutation that would result in the targeted amino acid substitution. Pfu
DNA polymerase was then used to amplify both strands of the double-stranded vector, which
contained the CBH I insertion sequence, with the resultant inclusion of the desired mutation from
the synthetic oligonucleotides. Following temperature cycling, the product was treated with the
endonuclease Dpn I to digest the parental methylated DNA template and the PCR product was
used to transform Epicurian Coli XL1-Blue supercompetent cells.

The vector pPFE2/CBHI requires a relatively long PCR reaction (8.2 kb) to make
site-specific changes using the Stratagene QuikChange protocol. The PCR reaction was optimized as
follows using a GeneAmp PCR System 2400 (Perkin Elmer Corporation). The reaction mixture
contained 50 ng of template DNA, 125 ng each of the sense and antisense mutagenic primers,
5 µL of Stratagene 10X cloned Pfu buffer, 200 µM of each dNTP, 5 mM MgCl2 (total final
concentration of MgCl2 is 7 mM), and 2.5 U Pfu Turbo DNA polymerase. The PCR reaction was
carried out for 30 cycles, each consisting of 1 minute denaturation at 96°C, 1 minute annealing
at 69°C and a final extension for 10 min at 75°C, followed by a hold at 4°C. Agarose gel
electrophoresis, ethidium bromide staining, and visualization under UV transillumination were
used to confirm the presence of a PCR product.
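
The concentrations quoted for the reaction mixture can be turned into pipetting volumes with the usual C1·V1 = C2·V2 relation. The Python sketch below assumes a 50 µL reaction volume and hypothetical stock concentrations (10 mM for each dNTP, 50 mM MgCl2); none of these three values is stated in the text. It also shows that a 7 mM total with 5 mM added MgCl2 implies 2 mM magnesium supplied by the buffer.

```python
# Sketch: stock volumes for a PCR mix via C1*V1 = C2*V2.
# ASSUMPTIONS (not from the text): 50 uL reaction, 10 mM dNTP stocks, 50 mM MgCl2 stock.

def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul):
    """Volume of stock (uL) giving `final_conc` in the reaction; same conc units for both."""
    return final_conc * reaction_volume_ul / stock_conc


REACTION_VOLUME_UL = 50.0  # assumed

# 200 uM (0.2 mM) final of each dNTP from an assumed 10 mM stock
dntp_volume_ul = stock_volume_ul(0.2, 10.0, REACTION_VOLUME_UL)

# 5 mM final added MgCl2 from an assumed 50 mM stock
mgcl2_volume_ul = stock_volume_ul(5.0, 50.0, REACTION_VOLUME_UL)

# Total magnesium is stated as 7 mM, so the remainder comes from the reaction buffer.
mg_from_buffer_mm = 7.0 - 5.0

print(dntp_volume_ul, mgcl2_volume_ul, mg_from_buffer_mm)
```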

PCR products were digested with the restriction enzyme DpnI, to degrade
un-mutagenized parental DNA, and transformed into E. coli (Stratagene Epicurian Coli
Supercompetent XL-1 Cells). Ampicillin resistant colonies were picked from LB-amp100 plates
and mutations were confirmed by DNA sequencing.

Template DNA from E. coli XL1-Blue cells transformed with DpnI-treated mutagenized
DNA was prepared for sequencing using the QIAprep-spin plasmid purification mini-prep
procedure (Qiagen, Inc.). The transformed XL1-Blue cells were grown overnight in 5 mL of LB
broth with 100 µg/mL ampicillin selection. Cells were removed by centrifugation and the
plasmid isolated using the protocol outlined in the QIAprep-spin handbook. The concentration of
the template DNA was adjusted to 0.25 µg/µL and shipped along with sequencing
oligonucleotides to the DNA Sequencing Facility at Iowa State University.

After the mutation was confirmed by DNA sequence alignment comparisons using the
software package OMIGA, the DNA was prepared for transformation of A. awamori. The
transformed E. coli XL1-Blue cells were grown overnight on LB plates with 100 µg/mL
ampicillin at 37°C. A single colony was then used to inoculate a 1 L baffled Erlenmeyer flask
that contained 500 mL of LB broth and 100 µg/mL ampicillin. The culture was allowed to grow
for 16 to 20 hours at 37°C with 250 rpm shaking in a NBS reciprocating shaking incubator. The
cells were harvested and the plasmid DNA purified using a Promega maxi-prep purification kit.
The purified maxi-prep DNA was subsequently used to transform A. awamori spheroplasts using
the method described below.

Transformation of Aspergillus awamori with Trichoderma reesei CBH I coding sequence.
Generating Fungal Spheroplasts.

A. awamori spheroplasts were generated from two-day-old cultures of mycelia pellets. A
heavy spore suspension was inoculated into 50 mL of CM broth (5.0 g/L yeast extract; 5.0 g/L
tryptone; 10 g/L glucose; 50 mL/L 20X Clutterbuck's salts, pH 7.5 (adjusted by addition of 2.0N
NaOH)) and grown at 225 rpm and 28°C in a baffled 250 mL Erlenmeyer flask. The mycelia
were collected by filtration through Miracloth and washed with ~200 mL KCM (0.7M KCl;
10 mM MOPS pH 5.8). The washed mycelia were transferred to 50 mL of KCM + 500 mg
Novozym 234 in a 50-mL unbaffled flask and incubated O/N at 80 rpm and 30°C. After
digestion, the remaining mycelia were removed by filtration through Miracloth and the
spheroplasts were collected in 50 mL disposable tubes and pelleted at 2500 x g in a swinging
bucket rotor for 15 minutes. The supernatant was discarded and the spheroplasts gently
resuspended in 20 mL 0.7M KCl by trituration with a 25-mL disposable pipet. The spheroplasts
were pelleted and washed again, then resuspended in 10 mL KC (0.7M KCl + 50 mM CaCl2).
After being pelleted, the spheroplasts were resuspended into 1.0 mL of KC.

Transformation was carried out using 50 µL of spheroplasts + 5 µL DNA (pPFE1 or
pPFE2, ~200 ng/mL) + 12.5 µL PCM (40% PEG8000 + 50 mM CaCl2 + 10 mM MOPS pH 5.8).
After incubation for 60 min on ice, 0.5 mL PCM was added and the mixture was incubated for
45 min at room temperature. One milliliter of KCl was added and 370 µL of the mix was added
to 10 mL of molten CMK (CM + 2% agar + 0.7M KCl) top agar at 55°C. This mixture was
immediately poured onto a 15 mL CM170 plate (CM + 2% agar + 170 µg/mL Zeocin). Negative
transformation controls substituted sterile dH2O for DNA. Positive spheroplast regeneration
controls were performed by plating the transformation mix onto CM plates without Zeocin. The
poured plates were incubated at 28°C in the dark for 2-7 days.

Transformation of Aspergillus awamori with native and modified CBH I coding sequence.

Aspergillus awamori spore stocks were stored at -70°C in 20% glycerol, 10% lactose.
After thawing, 200 µL of spores were inoculated into 50 mL CM broth in each of eight baffled
250 mL Erlenmeyer flasks. The cultures were grown at 28°C, 225 rpm for 48 h. The mycelial
balls were removed by filtration with sterile Miracloth (Calbiochem, San Diego, CA) and
washed thoroughly with sterile KCM. Approximately 10 g of washed mycelia were transferred to
50 mL KCM + 250 mg Novozym 234 in a 250 mL baffled Erlenmeyer flask. The digestion
mixture was incubated at 30°C, 80 rpm for 1-2 h and filtered through Miracloth into 50 mL
conical centrifuge tubes. The spheroplasts were pelleted at 2000 x g for 15 min and resuspended
in 0.7M KCl by gentle trituration with a 25 mL pipette. This was repeated once. After a third
pelleting, the spheroplasts were resuspended in 10 mL KC, pelleted and resuspended in 0.5 mL
KC using a wide-bore pipet tip. The washed spheroplasts were transformed by adding 12.5 µL
PCM and 5 µL DNA (~0.5 µg/µL) to 50 µL of spheroplasts in sterile 1.5 mL Eppendorf tubes.
After incubation on ice for 45 minutes, 0.5 mL of room temperature PCM was added to the
transformation mixture and was mixed by trituration with a wide-bore pipet tip. The mixture was
incubated at room temperature for 45 minutes. One milliliter of KC was added and mixed. The
mixture was allocated between four tubes of CM top agar at 55°C, which were each poured over
a 15 mL CM170 plate. The plates were incubated at 28°C for 2-3 days. Subsurface colonies were
partially picked with a sterile wide-bore pipet tip, exposing the remaining part of the colony to air
and promoting rapid sporulation. After sporulation, spores were streaked onto several successive
CM100 or CM300 plates. After a monoculture was established, heavily sporulated plates were
flooded with sterile spore suspension medium (20% glycerol, 10% lactose), the spores were
suspended and aliquots were frozen at -70°C. Working spore stocks were stored on CM slants in
screw cap tubes at 4°C. Protein production was confirmed and followed by western blot using
anti-CBH I monoclonal antibodies and the Novex Western Breeze anti-mouse chromogenic
detection kit (Novex, San Diego, CA). Insertion of the gene was confirmed by extracting genomic
DNA using the YeaStar Genomic DNA Kit (Zymo Research, Orange, CA) and carrying out PCR
with Pfu-Turbo DNA polymerase (Stratagene, La Jolla) and cbh 1 primers.

Application Serial # 10/031,496
Appendix A
Patent
Attorney Docket # NREL 99-45

Production and purification of native rCBH I enzyme from Aspergillus awamori.

For enzyme production, spores were inoculated into 50 mL CM basal starch medium, pH
7.0, and grown at 32°C, 225 rpm in 250 mL baffled flasks. The cultures were transferred to 1.0 L
of basal starch medium in 2800 mL Fernbach flasks and grown under similar conditions. For
large-scale enzyme production (>1 mg), these cultures were transferred to 10 L basal starch
medium in a New Brunswick BioFlo3000 fermenter (10-L working volume) maintained at 20%
DO, pH 7.0, 25°C, and 300 rpm. The fermentation was harvested by filtration through Miracloth
after 2-3 days of growth.

After further clarification by glass fiber filtration, the rCBH I protein was purified by
passing the fermentation broth over four CBinD 900 cartridge columns (Novagen, Madison, WI)
connected in parallel using a Pharmacia FPLC System loading at 1.0 mL/min (Amersham
Pharmacia Biotech, Inc., Piscataway, NJ). The cartridges were equilibrated in 20 mM Bis-Tris
pH 6.5 prior to loading and washed with the same buffer after loading. The bound rCBH I was
then eluted with 100% ethylene glycol (3 mL/column) using a syringe. Alternatively, the
supernatant was passed over a para-aminophenyl β-D-cellobioside affinity column, washed with
100 mM acetate buffer, pH 5.0, 1 mM gluconolactone and eluted in the same buffer containing
10 mM cellobiose. In either method, the eluted rCBH I was concentrated in a Millipore
Ultrafree-15 spin concentrator with a 10 kDa Biomax membrane to <2.0 mL and loaded onto a Pharmacia
SuperDex200 16/60 size-exclusion column. The mobile phase was 20 mM sodium acetate, 100
mM sodium chloride, and 0.02% sodium azide, pH 5.0 running at 1.0 mL/min. The eluted
protein was concentrated and stored at 4°C. Protein concentrations were determined for each
mutant based upon absorbance at 280 nm and calculated from the extinction coefficient and
molecular weight for each individual protein as determined by primary amino acid sequence
using the ProtParam tool on the ExPASy website (http://www.expasy.ch/tools/protparam.html).
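
The calculation described above, concentration from absorbance at 280 nm using a sequence-derived extinction coefficient and molecular weight, can be sketched as follows. The sketch assumes the widely used approximation of 5500, 1490 and 125 M^-1 cm^-1 per tryptophan, tyrosine and cystine respectively, a 1 cm path length, and that all cysteines are paired unless stated otherwise. The function names, the toy peptide and the 10,000 Da molecular weight are hypothetical and are not CBH I values.

```python
# Sketch: protein concentration from A280 via the Beer-Lambert law.
# ASSUMPTIONS: per-residue coefficients below, 1 cm path, all Cys paired by default.

def molar_extinction_280(protein, cystines=None):
    """Approximate molar extinction coefficient at 280 nm (M^-1 cm^-1)."""
    n_trp = protein.count("W")
    n_tyr = protein.count("Y")
    if cystines is None:
        cystines = protein.count("C") // 2  # assume every cysteine is in a disulfide
    return 5500 * n_trp + 1490 * n_tyr + 125 * cystines


def concentration_mg_per_ml(a280, epsilon, mw_da, path_cm=1.0):
    """Beer-Lambert: molar conc = A / (epsilon * path); mol/L * g/mol = g/L = mg/mL."""
    molar = a280 / (epsilon * path_cm)
    return molar * mw_da


toy_protein = "AWYWCCY"  # hypothetical peptide: 2 Trp, 2 Tyr, 2 Cys
epsilon = molar_extinction_280(toy_protein)
conc = concentration_mg_per_ml(a280=1.4105, epsilon=epsilon, mw_da=10000.0)
print(epsilon, conc)
```

Because each variant changes the sequence only slightly, the extinction coefficient is unchanged by an Asn to Ala substitution, while the molecular weight used in the conversion shifts by a small amount.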

Clutterbuck's Salts (20X)

NaNO3 120.0 g
KCl 10.4 g
MgSO4·7H2O 10.4 g
KH2PO4 30.4 g

CM -

Yeast Extract - 5 g/L
Tryptone - 5 g/L
Glucose - 10 g/L
Clutterbuck's Salts - 50 mL
Add above to 900 mL dH2O, pH to 7.5, bring to 1000 mL

CM Agar - CM + 20 g/L Agar
CMK - CM Agar + 0.7M KCl
CM100 - CM + 100 µg/mL Zeocin (Invitrogen, Carlsbad, CA)
CM170 - CM + 170 µg/mL Zeocin, 15 mL/plate
KCl - 0.7M KCl
KC - 0.7M KCl + 50 mM CaCl2
KCM - 0.7M KCl + 10 mM MOPS, pH 5.8
PCM - 40% PEG 8000, 50 mM CaCl2, 10 mM MOPS pH 5.8
(mix 4 mL 50% PEG + 0.5 mL 500 mM CaCl2 stock + 0.5 mL 100 mM MOPS stock)

Basal Starch Medium -

Casein Hydrolysate, Enzymatic 5 g/L
NH4Cl 5 g/L
Yeast Extract 10 g/L
Tryptone 10 g/L
MgSO4·7H2O 2 g/L
Soluble Starch 50 g/L
Buffer (Bis-Tris-Propane) 50 mM
pH to 7.0 with NaOH
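
The recipes above can be cross-checked with simple dilution arithmetic. The Python sketch below confirms that the PCM mixing instruction (4 mL of 50% PEG, 0.5 mL of 500 mM CaCl2 stock and 0.5 mL of 100 mM MOPS stock) yields the stated final composition, and that 50 mL of 20X Clutterbuck's salts per 1000 mL gives a 1X working concentration. The function name is illustrative, and volumes are assumed to be additive.

```python
# Sketch: final concentration after mixing, assuming volumes are additive.

def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration of one component after dilution into `total_volume`."""
    return stock_conc * stock_volume / total_volume


# PCM: 4 mL 50% PEG + 0.5 mL 500 mM CaCl2 + 0.5 mL 100 mM MOPS
pcm_total_ml = 4.0 + 0.5 + 0.5
peg_percent = final_concentration(50.0, 4.0, pcm_total_ml)
cacl2_mm = final_concentration(500.0, 0.5, pcm_total_ml)
mops_mm = final_concentration(100.0, 0.5, pcm_total_ml)

# CM: 50 mL of 20X Clutterbuck's salts brought to 1000 mL
salts_x = final_concentration(20.0, 50.0, 1000.0)

print(peg_percent, cacl2_mm, mops_mm, salts_x)
```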

Example 4. Production of Reduced Glycosylation rCBH I: Sites N270A; N45A; and
N384A.

rCBHI/pPFE2 has been optimized using site-directed mutagenesis to achieve expression of
native molecular weight CBH I in A. awamori in the following way. The QuikChange SDM
kit (Stratagene, San Diego, CA) was used to make point mutations, switch amino acids, and
delete or insert amino acids in the native cbh 1 gene sequence. The QuikChange SDM
technique was performed using thermotolerant Pfu DNA polymerase, which replicates both
plasmid strands with high fidelity and without displacing the mutant oligonucleotide primers.
The procedure used the polymerase chain reaction (PCR) to modify the cloned cbh 1 DNA. The
basic procedure used a supercoiled double stranded DNA (dsDNA) vector, with the cbh 1 gene
insert, and two synthetic oligonucleotide primers containing a desired mutation. The
oligonucleotide primers, each complementary to opposite strands of the vector, extend during
temperature cycling by means of the polymerase. On incorporation of the primers, a mutated
plasmid containing the desired nucleotide substitutions was generated. Following temperature
cycling, the PCR product was treated with the DpnI restriction enzyme. DpnI is specific for
methylated and hemi-methylated DNA and thus digests the unmutated parental DNA template,
selecting for the mutation-containing, newly synthesized DNA. The nicked vector DNA,
containing the desired mutations, was then transformed into E. coli. The small amount of
template DNA required to perform this reaction and the high fidelity of the Pfu DNA
polymerase contribute to the high mutation efficiency and minimize the potential for the
introduction of random mutations. Three glycosylation-site amino acids on the protein surface were
targeted for substitution of an alanine (A) residue in place of asparagine (N). Single site
substitutions were successfully completed in the cbh 1 coding sequence at sites N45, N270, and
N384 of SEQ ID NO: 4 by site-directed mutagenesis, and confirmed by DNA sequencing.

Double and triple combinations of this substitution have also been completed in the cbh 1
coding sequence at sites N45, N270, and N384 by site directed mutagenesis and confirmed by
DNA sequencing. These double and triple site constructs also yield rCBH I enzymes with
reduced glycosylation and, presumably, native activity.
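
The full set of single, double and triple site constructs implied by the three targeted asparagines can be enumerated programmatically. The Python sketch below uses `itertools.combinations`; the function names and the label format (sites listed in ascending order) are illustrative choices and may differ from the site ordering used elsewhere in this description.

```python
# Sketch: enumerate all single, double and triple N->A site combinations.
from itertools import combinations

GLYCOSYLATION_SITES = (45, 270, 384)


def site_combinations(sites):
    """All non-empty subsets of `sites`, ordered by size then position."""
    return [
        combo
        for size in range(1, len(sites) + 1)
        for combo in combinations(sites, size)
    ]


def label(combo):
    """Illustrative label, e.g. (45, 270) -> 'CBHI-N45/270A'."""
    return f"CBHI-N{'/'.join(str(site) for site in combo)}A"


constructs = site_combinations(GLYCOSYLATION_SITES)
for combo in constructs:
    print(label(combo))
```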



Table 1.

Construct          Host          MW (kDa)    Km    Vmax (µmol pNP/min/mg)
rCBHI N270/45A     A. awamori    58.3
rCBHI N384/270A    A. awamori    58.8

As shown in Table 1, Western blot analysis of the supernatant obtained from a culture of the
single glycosylation site mutant CBHI-N270A expressed in A. awamori demonstrated a
decrease in the amount of glycosylation of the protein, to a lower molecular weight (61.7 kDa),
as compared to that of the wild type cDNA (63.3 kDa) and the wild type genomic
DNA (63.3 kDa). These results demonstrate a reduction in the level of glycosylation in the
reduced glycosylation mutant CBHI-N270A via expression in A. awamori. It is also shown in
the Table that the CBHI-N270A enzyme nearly retained its native enzymatic activity when
assayed using the pNPL substrate. The variants CBHI-N45A and CBHI-N384A also demonstrate a
reduction in the amount of glycosylation and native activity when expressed from the heterologous
host A. awamori, and the double mutations CBHI-N270/45A and
CBHI-N270/384A reduce the level of glycosylation further.
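
The apparent molecular weights quoted above and in Table 1 can be compared directly to show how much mass each construct loses relative to the wild type expressed in A. awamori. The Python sketch below performs only that subtraction; the dictionary keys are illustrative labels, and the values are apparent weights from Western blots, so the differences are approximate.

```python
# Sketch: apparent mass reduction of each construct relative to wild type (kDa).
WILD_TYPE_KDA = 63.3

apparent_mw_kda = {
    "N270A": 61.7,
    "N270/45A": 58.3,
    "N384/270A": 58.8,
}

mass_reduction_kda = {
    construct: round(WILD_TYPE_KDA - mw, 1)
    for construct, mw in apparent_mw_kda.items()
}

for construct, shift in mass_reduction_kda.items():
    print(f"{construct}: {shift} kDa lighter than wild type")
```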

Example 5. Amino Acid Mutations Targeted To Improve Thermal Tolerance Of CBH I.

Helix Capping Mutants.

All α-helices display dipole moments, i.e. positive at the N-terminus and negative at the
C-terminus. Compensation for such dipole moments (capping) has been observed in a number of
protein structures and has been shown to improve protein stability. For example, the
introduction of a negatively charged amino acid at the N-terminus and a positively charged
amino acid at the C-terminus of an α-helix increased the thermostability of T4 lysozyme and hen
lysozyme, via an electrostatic interaction with the "helix dipole." Five amino acid sites were
identified for helix capping (see Table 5).

Peptide Strain Removal Mutants.

A small fraction of residues adopt torsion angles, phi-psi angles, which are unfavorable. 
It has been shown that mutation of such residues to Gly increased the protein stability as much as 
4 kcal/mol. One amino acid site was selected for peptide strain removal (see Table 3). 

Helix Propensity Mutants. 

Two amino acid sites were selected for helix propensity improvement. 

Disulfide Bridge Mutants. 

Disulfide bonds introduced between amino acid positions 9 and 164 and between 21 and
142 in phage T4 lysozyme have been shown to significantly increase the stability of the
respective enzymes toward thermal denaturation. The engineered disulfide bridge between
residues 197 and 370 of CBH I should span the active site cleft and enhance its thermostability. 
The active site of CBH I is in a tunnel. The roof over the tunnel appears to be fairly mobile (high 
temperature-factors). At an elevated temperature the mobility of the tunnel is too significant to 
position all the active site residues. The disulfide linkage should stabilize the roof of the tunnel 
making the enzyme a consistent exocellulase even at a high temperature. Two amino acid sites 
were identified for new disulfide bridge generation. 

Deletion Mutants.

Thermostable proteins have shorter loops that connect their structural elements than
typical proteins. Our sequence alignment of CBH I, with its close homologs, suggests that the
following residues may be deleted without significantly affecting its function. These loops
exhibited high mobility as well. Three loops were identified, but these modifications were
considered high risk (buried hydrophobic regions may be exposed to solvent upon deletion of a
natural loop) and will be saved for future work.

Proline Replacement Mutants.

The unique structure of proline dictates that fewer degrees of freedom are allowed around
the alpha carbon than in most other amino acids. As a result, peptides tend to
lose flexibility in regions rich in proline. To assess possible sites for replacement of
existing amino acids with proline, the phi/psi angles of candidate amino acid sites must be
consistent with those allowed for proline. Each new site must also be evaluated for allowable side chain
interactions and for assurance that interactions with substrate are not altered. Seventeen amino acid
sites were identified for proline replacement (see Table 2).
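
The first screening step described above, checking whether a candidate site already has proline-like backbone geometry, might be sketched as follows. The phi window of about -63° ± 15° is a commonly quoted range for proline and is used here as a placeholder cutoff, not as the criterion actually applied to CBH I. The site names and angles are made-up illustrative values, and the psi, side-chain, and substrate-contact checks mentioned in the text are not modeled.

```python
from typing import NamedTuple


class Site(NamedTuple):
    name: str
    phi: float  # backbone phi angle in degrees
    psi: float  # backbone psi angle in degrees (carried along, not filtered on here)


# Placeholder window for proline-compatible phi; an assumption, not a source value.
PRO_PHI_CENTER = -63.0
PRO_PHI_TOLERANCE = 15.0


def phi_compatible_with_proline(site: Site) -> bool:
    """True if the site's phi angle falls inside the assumed proline window."""
    return abs(site.phi - PRO_PHI_CENTER) <= PRO_PHI_TOLERANCE


# Hypothetical candidates; these angles are NOT measured from the CBH I structure.
candidates = [
    Site("site_A", -60.0, 140.0),
    Site("site_B", -120.0, 130.0),
    Site("site_C", 60.0, 40.0),
]

shortlist = [s.name for s in candidates if phi_compatible_with_proline(s)]
print(shortlist)
```

Sites that pass such a filter would still need the manual evaluation of side chain and substrate interactions described in the text.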

Example 6. Nucleic acid sequence of a variant exoglucanase.

The present example demonstrates the utility of the present invention for providing a
nucleic acid molecule having the nucleic acid sequence
5'-GGCGGAAACCCGCCIGGCACCACC-3' (SEQ ID NO: 3). The identified nucleic acid
sequence presents a novel linker region nucleic acid sequence that differs from the previously
reported nucleic acid sequence by the addition of one codon and the alteration of an adjacent
codon, both encoding a proline (see Figure 4). The invention in some aspects thus provides a
nucleic acid molecule encoding a cellobiohydrolase that comprises a linker region of about 6 to
20 amino acids in length as identified here.



Table 2. Proline mutations to improve thermal tolerance.

Mutation                                            Native sequence and mutagenic oligonucleotide

SEQ ID NO: 10  S8P - native sense strand            5'-GCACTCTCCAATCGGAGACTCACCCG-3'
SEQ ID NO: 11  Mutagenic sense strand               5'-GCACTCTCCAACCGGAGACTCACCCG-3'
SEQ ID NO: 12  Mutagenic anti-sense strand          5'-CGGGTGAGTCTCCGGTTGGAGAGTGC-3'

SEQ ID NO: 13  N27P - native sense strand           5'-GGCACGTGCACTCAACAGACAGGCTCCG-3'
SEQ ID NO: 14  Mutagenic sense strand               5'-GGCACGTGCACTCCACAGACAGGCTCCG-3'
SEQ ID NO: 15  Mutagenic anti-sense strand          5'-CGGAGCCTGTCTGTGGAGTGCACGTGCC-3'

SEQ ID NO: 16  A43P - native sense strand           5'-GGCGCTGGACTCACGCTACGAACAGCAGCACG-3'
SEQ ID NO: 17  Mutagenic sense strand               5'-GGCGCTGGACTCACCCTACGAACAGCAGCACG-3'
SEQ ID NO: 18  Mutagenic anti-sense strand          5'-CGTGCTGCTGTTCGTAGGGTGAGTCCAGCGCC-3'

SEQ ID NO: 19  G75P - native sense strand           5'-GCTGTCTGGACGGTGCCGCCTACGCG-3'
SEQ ID NO: 20  Mutagenic sense strand               5'-GCTGTCTGGACCCTGCCGCCTACGCG-3'
SEQ ID NO: 21  Mutagenic anti-sense strand          5'-CGCGTAGGCGGCAGGGTCCAGACAGC-3'

SEQ ID NO: 22  G94P - native sense strand           5'-GCCTCTCCATTGGCTTTGTCACCC-3'
SEQ ID NO: 23  Mutagenic sense strand               5'-GCCTCTCCATTCCCTTTGTCACCC-3'
SEQ ID NO: 24  Mutagenic anti-sense strand          5'-GGGTGACAAAGGGAATGGAGAGGC-3'

SEQ ID NO: 25  E190P - native sense strand          5'-GGCCAACGTTGAGGGCTGGGAGCC-3'
SEQ ID NO: 26  Mutagenic sense strand               5'-GGCCAACGTTCCGGGCTGGGAGCC-3'
SEQ ID NO: 27  Mutagenic anti-sense strand          5'-GGCTCCCAGCCCGGAACGTTGGCC-3'

SEQ ID NO: 28  S195P - native sense strand          5'-GGCTGGGAGCCGTCATCCAACAACGCG-3'
SEQ ID NO: 29  Mutagenic sense strand               5'-GGCTGGGAGCCGCCATCCAACAACGCG-3'
SEQ ID NO: 30  Mutagenic anti-sense strand          5'-CGCGTTGTTGGATGGCGGCTCCCAGCC-3'

SEQ ID NO: 31  K287P - native sense strand          5'-CGATACCACCAAGAAATTGACCGTTGTCACCC-3'
SEQ ID NO: 32  Mutagenic sense strand               5'-CGATACCACCAAGCCATTGACCGTTGTCACCC-3'
SEQ ID NO: 33  Mutagenic anti-sense strand          5'-GGGTGACAACGGTCAATGGCTTGGTGGTATCG-3'

SEQ ID NO: 34  A299P - native sense strand          5'-CGAGACGTCGGGTGCCATCAACCGATAC-3'
SEQ ID NO: 35  Mutagenic sense strand               5'-CGAGACGTCGGGTCCCATCAACCGATAC-3'
SEQ ID NO: 36  Mutagenic anti-sense strand          5'-GTATCGGTTGATGGGACCCGACGTCTCG-3'

SEQ ID NO: 37  Q312P/N315P - native sense strand    5'-GGCGTCACTTTCCAGCAGCCCAACGCCGAGCTTGG-3'
SEQ ID NO: 38  Mutagenic sense strand               5'-GGCGTCACTTTCCCGCAGCCCCCCGCCGAGCTTGG-3'
SEQ ID NO: 39  Mutagenic anti-sense strand          5'-CCAAGCTCGGCGGGGGGCTGCGGGAAAGTGACGCC-3'

SEQ ID NO: 40  G359P - native sense strand          5'-GGCTACCTCTGGCGGCATGGTTCTGG-3'
SEQ ID NO: 41  Mutagenic sense strand               5'-GGCTACCTCTCCCGGCATGGTTCTGG-3'
SEQ ID NO: 42  Mutagenic anti-sense strand          5'-CCAGAACCATGCCGGGAGAGGTAGCC-3'

SEQ ID NO: 43  S398P/S401P - native sense strand    5'-GCGGAAGCTGCTCCACCAGCTCCGGTGTCCCTGC-3'
SEQ ID NO: 44  Mutagenic sense strand               5'-GCGGAAGCTGCCCCACCAGCCCCGGTGTCCCTGC-3'
SEQ ID NO: 45  Mutagenic anti-sense strand          5'-GCAGGGACACCGGGGCTGGTGGGGCAGCTTCCGC-3'

SEQ ID NO: 46  A414P - native sense strand          5'-GTCTCCCAACGCCAAGGTCACC-3'
SEQ ID NO: 47  Mutagenic sense strand               5'-GTCTCCCAACCCCAAGGTCACC-3'
SEQ ID NO: 48  Mutagenic anti-sense strand          5'-GGTGACCTTGGGGTTGGGAGAC-3'

SEQ ID NO: 49  N431P/S433P - native sense strand    5'-GGCAGCACCGGCAACCCTAGCGGCGGCAACCC-3'
SEQ ID NO: 50  Mutagenic sense strand               5'-GGCAGCACCGGCCCCCCTCCCGGCGGCAACCC-3'
SEQ ID NO: 51  Mutagenic anti-sense strand          5'-GGGTTGCCGCCGGGAGGGGGGCCGGTGCTGCC-3'

Table 3. Mutation to remove peptide strain.

Mutation site                                       Native sequence and mutagenic oligonucleotide

SEQ ID NO: 52  S99G - native sense strand           5'-GGCTTTGTCACCCAGTCTGCGCAGAAGAACGTTGGC-3'
SEQ ID NO: 53  Mutagenic sense strand               5'-GGCTTTGTCACCCAGGGTGCGCAGAAGAACGTTGGC-3'
SEQ ID NO: 54  Mutagenic anti-sense strand          5'-GCCAACGTTCTTCTGCGCACCCTGGGTGACAAAGCC-3'


Table 3b. Y245G analogs to remove product inhibition.

Mutation site                                       Native sequence and mutagenic oligonucleotide

SEQ ID NO: 55  R251A - native sense strand          5'-CCGATAACAGATATGGCGGC-3'
SEQ ID NO: 56  Mutagenic sense strand               5'-CCGATAACGCCTATGGCGGC-3'
SEQ ID NO: 57  Mutagenic anti-sense strand          5'-GCCGCCATAGGCGTTATCGG-3'

SEQ ID NO: 58  R394A - native sense strand          5'-CCCGGTGCCGTGCGCGGAAGCTGCTCCACC-3'
SEQ ID NO: 59  Mutagenic sense strand               5'-CCCGGTGCCGTGGCCGGAAGCTGCTCCACC-3'
SEQ ID NO: 60  Mutagenic anti-sense strand          5'-GGTGGAGCAGCTTCCGGCCACGGCACCGGG-3'

SEQ ID NO: 61  F338A - native sense strand          5'-GCTGAGGAGGCAGAATTCGGCGGATCCTCTTTCTC-3'
SEQ ID NO: 62  Mutagenic sense strand               5'-GCTGAGGAGGCAGAAGCCGGCGGATCCTCTTTCTC-3'
SEQ ID NO: 63  Mutagenic anti-sense strand          5'-GAGAAAGAGGATCCGCCGGCTTCTGCCTCCTCAGC-3'

SEQ ID NO: 64  R267A - native sense strand          5'-GGAACCCATACCGCCTGGGCAACACCAGC-3'
SEQ ID NO: 65  Mutagenic sense strand               5'-GGAACCCATACGCCCTGGGCAACACCAGC-3'
SEQ ID NO: 66  Mutagenic anti-sense strand          5'-GCTGGTGTTGCCCAGGGCGTATGGGTTCC-3'

SEQ ID NO: 67  E385A - native sense strand          5'-CCTACCCGACAAACGAGACCTCCTCCACACCCGG-3'
SEQ ID NO: 68  Mutagenic sense strand               5'-CCTACCCGACAAACGCCACCTCCTCCACACCCGG-3'
SEQ ID NO: 69  Mutagenic anti-sense strand          5'-CCGGGTGTGGAGGAGGTGGCGTTTGTCGGGTAGG-3'


Table 4. N to A mutations to remove glycosylation.

Mutant                                              Native sequence and mutagenic oligonucleotide

SEQ ID NO: 70  N45A - native sense strand           5'-GGACTCACGCTACGAACAGCAGCACGAACTGC-3'
SEQ ID NO: 71  Mutagenic sense strand               5'-GGACTCACGCTACGGCCAGCAGCACGAACTGC-3'
SEQ ID NO: 72  Mutagenic anti-sense strand          5'-GCAGTTCGTGCTGCTGGCCGTAGCGTGAGTCC-3'

SEQ ID NO: 73  N270A - native sense strand          5'-CCCATACCGCCTGGGCAACACCAGCTTCTACGGCCC-3'
SEQ ID NO: 74  Mutagenic sense strand               5'-CCCATACCGCCTGGGCGCCACCAGCTTCTACGGCCC-3'
SEQ ID NO: 75  Mutagenic anti-sense strand          5'-GGGCCGTAGAAGCTGGTGGCGCCCAGGCGGTATGGG-3'

SEQ ID NO: 76  N384A - native sense strand          5'-GGACTCCACCTACCCGACAAACGAGACCTCCTCCACACCCG-3'
SEQ ID NO: 77  Mutagenic sense strand               5'-GGACTCCACCTACCCGACAGCCGAGACCTCCTCCACACCCG-3'
SEQ ID NO: 78  Mutagenic anti-sense strand          5'-CGGGTGTGGAGGAGGTCTCGGCTGTCGGGTAGGTGGAGTCC-3'


Table 5. Helix capping mutations to improve thermal tolerance.

Mutant                                              Native sequence and mutagenic oligonucleotide

SEQ ID NO: 79  E337R - native sense strand          5'-GCTGAGGAGGCAGAATTCGGCGG-3'
SEQ ID NO: 80  Mutagenic sense strand               5'-GCTGAGGAGGCACGCTTCGGCGG-3'
SEQ ID NO: 81  Mutagenic anti-sense strand          5'-CCGCCGAAGCGTGCCTCCTCAGC-3'

SEQ ID NO: 82  N327D - native sense strand          5'-GGCAACGAGCTCAACGATGATTACTGC-3'
SEQ ID NO: 83  Mutagenic sense strand               5'-GGCAACGAGCTCGACGATGATTACTGC-3'
SEQ ID NO: 84  Mutagenic anti-sense strand          5'-GCAGTAATCATCGTCGAGCTCGTTGCC-3'

SEQ ID NO: 85  A405D - native sense strand          5'-CCGGTGTCCCTGCTCAGGTCGAATCTCAGTCTCCC-3'
SEQ ID NO: 86  Mutagenic sense strand               5'-CCGGTGTCCCTGATCAGGTCGAATCTCAGTCTCCC-3'
SEQ ID NO: 87  Mutagenic anti-sense strand          5'-GGGAGACTGAGATTCGACCTGATCAGGGACACCGG-3'

SEQ ID NO: 88  Q410R - native sense strand          5'-GCTCAGGTCGAATCTCAGTCTCCCAACGCC-3'
SEQ ID NO: 89  Mutagenic sense strand               5'-GCTCAGGTCGAATCTCGCTCTCCCAACGCC-3'
SEQ ID NO: 90  Mutagenic anti-sense strand          5'-GGCGTTGGGAGAGCGAGATTCGACCTGAGC-3'

SEQ ID NO: 91  N64D - native sense strand           5'-CCCTATGTCCTGACAACGAGACCTGCGCG-3'
SEQ ID NO: 92  Mutagenic sense strand               5'-CCCTATGTCCTGACGACGAGACCTGCGCG-3'
SEQ ID NO: 93  Mutagenic anti-sense strand          5'-CGCGCAGGTCTCGTCGTCAGGACATAGGG-3'

SEQ ID NO: 94  N64D - native sense strand           5'-GCTCGACCCTATGTCCTGACAACGAGACCTGCGCGAAGAACTGC-3'
SEQ ID NO: 95  Mutagenic sense strand               5'-GCTCGACCCTATGTCCTGACGACGAGACCTGCGCGAAGAACTGC-3'
SEQ ID NO: 96  Mutagenic anti-sense strand          5'-GCAGTTCTTCGCGCAGGTCTCGTCGTCAGGACATAGGGTCGAGC-3'

Legend for Tables 2, 3, 3b, 4 and 5. Amino acid mutation sites are listed in the left column. The
first letter in the designation is the amino acid of the native protein, based upon the IUPAC
convention for one-letter codes for amino acids. The number represents the amino acid location
as designated from the start of the mature protein (excluding the signal peptide, i.e. QSA...). The
letter designation after the number represents the amino acid that will occur as a result of the
mutation. For example, N64D represents the asparagine at site 64 changed to an aspartic acid.
The native sense strand sequence for each site is listed in the right column, with the
oligonucleotide primers (sense and anti-sense) used to obtain the desired mutation below the
native sequence in each case. In addition, the codon for the targeted amino acid is bolded and the
nucleotide substitutions in the mutagenic primers are underlined. In some cases only one nucleotide
substitution was required to make the desired change, and in others 2 or 3 substitutions were
required. In a few cases, double mutations were made with a single mutagenic oligonucleotide.
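
The conventions in this legend lend themselves to a simple consistency check on any table entry. The sketch below, which is not part of the disclosed method, parses a designation such as N64D, lists the nucleotide substitutions between a native and a mutagenic sense strand, and confirms that the anti-sense primer is the reverse complement of the mutagenic sense primer. The function names are hypothetical; the three sequences are SEQ ID NOs: 94 to 96 from Table 5.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_mutation(designation: str):
    """Split a designation such as 'N64D' into (native, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", designation)
    if match is None:
        raise ValueError(f"unrecognized designation: {designation}")
    return match.group(1), int(match.group(2)), match.group(3)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def substitutions(native: str, mutant: str):
    """List (1-based position, native base, mutant base) for each mismatch."""
    if len(native) != len(mutant):
        raise ValueError("strands must have equal length")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(native, mutant)) if a != b]


# SEQ ID NOs: 94, 95 and 96 (N64D) as listed in Table 5.
native = "GCTCGACCCTATGTCCTGACAACGAGACCTGCGCGAAGAACTGC"
sense = "GCTCGACCCTATGTCCTGACGACGAGACCTGCGCGAAGAACTGC"
antisense = "GCAGTTCTTCGCGCAGGTCTCGTCGTCAGGACATAGGGTCGAGC"

print(parse_mutation("N64D"))
print(substitutions(native, sense))            # a single A-to-G change (AAC to GAC)
print(reverse_complement(sense) == antisense)  # the primers should pair exactly
```

For this entry a single nucleotide substitution converts the asparagine codon AAC to the aspartic acid codon GAC, which matches the "only one nucleotide substitution" case described in the legend.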